Qualitative test-cost sensitive classification

نویسندگان

  • Mumin Cebe
  • Cigdem Demir
چکیده

QUALITATIVE TEST-COST SENSITIVE CLASSIFICATION Mümin Cebe M.S. in Computer Engineering Supervisor: Assist. Prof. Dr. Çiğdem Gündüz Demir Agust, 2008 Decision making is a procedure for selecting the best action among several alternatives. In many real-world problems, decision has to be taken under the circumstances in which one has to pay to acquire information. In this thesis, we propose a new framework for test-cost sensitive classification that considers the misclassification cost together with the cost of feature extraction, which arises from the effort of acquiring features. This proposed framework introduces two new concepts to test-cost sensitive learning for better modeling the real-world problems: qualitativeness and consistency. First, this framework introduces the incorporation of qualitative costs into the problem formulation. This incorporation becomes important for many real world problems, from finance to medical diagnosis, since the relation between the misclassification cost and the cost of feature extraction could be expressed only roughly and typically in terms of ordinal relations for these problems. For example, in cancer diagnosis, it could be expressed that the cost of misdiagnosis is larger than the cost of a medical test. However, in the test-cost sensitive classification literature, the misclassification cost and the cost of feature extraction are combined quantitatively to obtain a single loss/utility value, which requires expressing the relation between these costs as a precise quantitative number. Second, it considers the consistency between the current information and the information after feature extraction to decide which features to extract. For example, it does not extract a new feature if it brings no new information but just confirms the current one; in other words, if the new feature is totally consistent with the current information. By doing so, the proposed framework could significantly decrease the cost of feature extraction, and hence, the overall cost without decreasing the classification accuracy. Such consistency behavior has not been iii

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms

In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...

متن کامل

A New Formulation for Cost-Sensitive Two Group Support Vector Machine with Multiple Error Rate

Support vector machine (SVM) is a popular classification technique which classifies data using a max-margin separator hyperplane. The normal vector and bias of the mentioned hyperplane is determined by solving a quadratic model implies that SVM training confronts by an optimization problem. Among of the extensions of SVM, cost-sensitive scheme refers to a model with multiple costs which conside...

متن کامل

Soft Methodology for Cost-and-error Sensitive Classification

Many real-world data mining applications need varying cost for different types of classification errors and thus call for cost-sensitive classification algorithms. Existing algorithms for cost-sensitive classification are successful in terms of minimizing the cost, but can result in a high error rate as the trade-off. The high error rate holds back the practical use of those algorithms. In this...

متن کامل

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

Due to the rise of technology, the possibility of fraud in different areas such as banking has been increased. Credit card fraud is a crucial problem in banking and its danger is over increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...

متن کامل

Dynamic Cost-sensitive Naive Bayes Classification for Uncertain Data

The uncertain data as an important aspect of data mining, has received considerable attention, due to its importance in many applications, but little study has been paid to the cost-sensitive classification on uncertain data, so this paper proposes the dynamic costsensitive Naive Bayes classification for mining uncertain data (DCSUNB). Firstly, we apply the probability density to dispose uncert...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Pattern Recognition Letters

دوره 31  شماره 

صفحات  -

تاریخ انتشار 2010